Intrinsic dimension estimation by maximum likelihood in isotropic probabilistic PCA

Authors

  • Charles Bouveyron
  • Gilles Celeux
  • Stéphane Girard
Abstract

A central issue in dimension reduction is choosing a sensible number of dimensions to retain. This work demonstrates a surprising result: the maximum likelihood criterion is asymptotically consistent for determining the intrinsic dimension of a dataset in an isotropic version of Probabilistic Principal Component Analysis (PPCA). Numerical experiments on simulated and real datasets show that the maximum likelihood criterion can be used in practice and outperforms existing intrinsic dimension selection criteria in various situations. The paper also outlines the limits of the maximum likelihood criterion, which lead the authors to recommend the AIC criterion in specific situations. A useful application of this work would be the automatic selection of intrinsic dimensions in mixtures of isotropic PPCA for classification.
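
As a rough illustration of the maximum likelihood criterion mentioned in the abstract, the sketch below scores each candidate dimension d by a maximised Gaussian log-likelihood and keeps the best one. It is not taken from the paper. It assumes that the isotropic model has a covariance matrix with only two distinct eigenvalues, a (repeated d times) and b (repeated p - d times), in which case the estimates are the means of the top d and the remaining p - d sample eigenvalues. The function names `isotropic_ppca_loglik` and `select_dimension` are hypothetical.

```python
import numpy as np


def isotropic_ppca_loglik(eigvals, n, d):
    """Maximised log-likelihood for d retained dimensions.

    Assumption (not from the source): the model covariance has d
    eigenvalues equal to a and p - d eigenvalues equal to b.
    `eigvals` must be sorted in decreasing order.
    """
    p = eigvals.size
    a = eigvals[:d].mean()   # estimate of the signal variance
    b = eigvals[d:].mean()   # estimate of the noise variance
    # With these estimates, trace(Sigma^{-1} S) equals p, so only the
    # log-determinant term depends on d.
    log_det = d * np.log(a) + (p - d) * np.log(b)
    return -0.5 * n * (log_det + p * (1.0 + np.log(2.0 * np.pi)))


def select_dimension(X):
    """Return the d in 1..p-1 that maximises the log-likelihood above."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                          # sample covariance (ML version)
    eigvals = np.linalg.eigvalsh(S)[::-1]      # decreasing order
    eigvals = np.clip(eigvals, 1e-12, None)    # guard against log(0)
    scores = [isotropic_ppca_loglik(eigvals, n, d) for d in range(1, p)]
    return int(np.argmax(scores)) + 1


if __name__ == "__main__":
    # Simulated data: 3 high-variance directions, 7 noise directions.
    rng = np.random.default_rng(0)
    variances = np.array([10.0] * 3 + [1.0] * 7)
    X = rng.standard_normal((2000, 10)) * np.sqrt(variances)
    print(select_dimension(X))
```

The simulated example uses a large sample and a clear gap between signal and noise variances. The abstract notes that the criterion has limits and that AIC is recommended in specific situations, which this sketch does not cover.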

Similar resources

Intrinsic Dimension Estimation by Maximum Likelihood in Probabilistic PCA

A central issue in dimension reduction is choosing a sensible number of dimensions to be retained. This work demonstrates the asymptotic consistency of the maximum likelihood criterion for determining the intrinsic dimension of a dataset in an isotropic version of Probabilistic Principal Component Analysis (PPCA). Numerical experiments on simulated and real datasets show that the maximum likelih...

Probabilistic PCA for t distributions

Principal component analysis (PCA) is a popular technique for dimension reduction. Since the scope of its application is limited by its global linearity, several generalizations are proposed in the literature, among which the probabilistic PCA, introduced recently by Tipping and Bishop, is a particularly important one. Based on a probabilistic model, these authors obtained a PCA type projection...

Regularized Maximum Likelihood for Intrinsic Dimension Estimation

We propose a new method for estimating the intrinsic dimension of a dataset by applying the principle of regularized maximum likelihood to the distances between close neighbors. We propose a regularization scheme which is motivated by divergence minimization principles. We derive the estimator by a Poisson process approximation, argue about its convergence properties and apply it to a number of...

Maximum Likelihood Estimation of Intrinsic Dimension

We propose a new method for estimating intrinsic dimension of a dataset derived by applying the principle of maximum likelihood to the distances between close neighbors. We derive the estimator by a Poisson process approximation, assess its bias and variance theoretically and by simulations, and apply it to a number of simulated and real datasets. We also show it has the best overall performanc...

Intrinsic Dimensionality Estimation in Visualizing Toxicity Data

Over the years, a number of dimensionality reduction techniques have been proposed and used in chemoinformatics to perform nonlinear mappings. Nevertheless, data visualization techniques can be efficiently applied for dimensionality reduction mainly when the data are not really high-dimensional and can be represented as a nonlinear low-dimensional manifold when it is possible to reduce...

Journal title:
  • Pattern Recognition Letters

Volume 32

Publication date 2011